Making Problem Diagnosis Work for Large-Scale, Production Storage Systems

Authors

  • Michael P. Kasick
  • Priya Narasimhan
  • Kevin Harms

Abstract

Intrepid has a very large production GPFS storage system consisting of 128 file servers, 32 storage controllers, 1,152 disk arrays, and 11,520 total disks. In such a large system, performance problems are both inevitable and difficult to troubleshoot. We present our experiences in taking an automated problem-diagnosis approach from proof of concept on a 12-server test-bench parallel-filesystem cluster and making it work on Intrepid’s storage system. We also present a 15-month case study of problems observed in the analysis of 624 GB of Intrepid’s instrumentation data, in which we diagnose a variety of performance-related storage-system problems in a matter of hours, compared with the days or longer that manual approaches require.

Tags: problem diagnosis, storage systems, infrastructure, case study.

Similar resources

Security-Constrained Unit Commitment Considering Large-Scale Compressed Air Energy Storage (CAES) Integrated With Wind Power Generation

Environmental concerns and the depletion of nonrenewable resources have generated great interest in renewable energy resources. Cleanness and high potential are factors that have driven the fast growth of wind energy. However, the stochastic nature of wind energy makes the presence of energy storage systems (ESS) in wind-integrated power systems inevitable. Due to capability of being used in large-scale sys...

Evaluation and Prioritization of Criteria Affecting the Selection of Landscape Species, Using Multi-Criteria Decision-Making Systems

It is impractical to implement conservation efforts for all species due to the complexity of natural systems, the large scale of biodiversity issues, and budget limitations. Prioritizing species of conservation importance can alleviate this issue. Multiple interrelated criteria may be used for conservation prioritization of species. Therefore, the accurate evaluation of criteria is a multi-criteria dec...

A genetic algorithm approach for a dynamic cell formation problem considering machine breakdown and buffer storage

The cell formation problem mainly addresses how machines should be grouped and parts processed in cells. In dynamic environments, product mix and demand change in each period of the planning horizon. Incorporating such an assumption in the model increases the flexibility of the system to meet customers’ requirements. In this model, to ensure the reliability of the system in presence of unreliable machine...

A New Compromise Decision-making Model based on TOPSIS and VIKOR for Solving Multi-objective Large-scale Programming Problems with a Block Angular Structure under Uncertainty

This paper proposes a compromise model, based on a new method, to solve multi-objective large-scale linear programming (MOLSLP) problems with a block angular structure involving fuzzy parameters. The problem involves fuzzy parameters in the objective functions and constraints. In this compromise programming method, two concepts are considered simultaneously. The first is that the optimal ...

Scheduling Problem of Virtual Cellular Manufacturing Systems (VCMS); Using Simulated Annealing and Genetic Algorithm based Heuristics

In this paper, we present a simulated annealing (SA) and a genetic algorithm (GA) based on heuristics for scheduling problem of jobs in virtual cellular manufacturing systems. A virtual manufacturing cell (VMC) is a group of resources that is dedicated to the manufacturing of a part family. Although this grouping is not reflected in the physical structure of the manufacturing system, but machin...

Publication date: 2013